Zymogenic latency in an ∼250-million-year-old astacin metallopeptidase

The horseshoe crab Limulus polyphemus is an ancient chelicerate that is a model organism for the study of the evolution and function of peptidases. It contains a member of the astacin metallopeptidases, the mechanism of latency of which was revealed by X-ray structural analysis.

Despite their name, horseshoe crabs are actually not crustaceans but chelicerates that are phylogenetically closer to spiders, ticks and scorpions than to crabs (Lankester, 1881; Ballesteros & Sharma, 2019). L. polyphemus has a remarkable estimated life expectancy of up to 20 years (Walls et al., 2002) and is frequently used as a laboratory animal model to study its compound eyes, its simple nervous system and marine invertebrate embryology in general (Smith, 2022). Moreover, it possesses an ancient and primitive proteolytic blood-coagulation and innate immunity system, which is the only one found outside vertebrates (Rowley et al., 1984; Doolittle, 2010; Schmid et al., 2019; Winter et al., 2020; Eleftherianos et al., 2021). Thus, L. polyphemus is an important organism for the study of the evolution and function of peptidases (Becker-Pauly et al., 2009).
Astacins share a basic domain architecture consisting of an N-terminal signal peptide for secretion, a pro-peptide (PP) of variable length (from 34 residues in astacin to 486 residues in Drosophila melanogaster tolkin; Finelli et al., 1995; Gomis-Rüth, Trillo-Muyo et al., 2012; Arolas et al., 2018) for zymogenic latency and the catalytic domain (CD; Gomis-Rüth, Trillo-Muyo et al., 2012). This core may be C-terminally extended by disparate modules, among which are linkers (LNK), CUB domains (found in the complement component C1r/1s, the embryonic sea urchin Uegf and bone morphogenetic protein 1; Bork & Beckmann, 1993; PF00431) and MAM domains (common to meprins, A5 receptor protein and tyrosine phosphatase μ; Cismasiu et al., 2004; PF00629). Two astacins, namely a short 240-residue protein (astl gene; UniProt accession B4F319) and a long 403-residue protein (astl-mam gene; UniProt B4F320), were identified in L. polyphemus, recombinantly expressed and biochemically characterized (Becker-Pauly et al., 2009). The short form was predominantly found in the eyes and the brain, which suggests a function in the nervous system, while the long form was ubiquitous (Becker-Pauly et al., 2009). The short paralogue has the basic domain architecture of the family, while the long paralogue further contains an LNK and an MAM domain. Both astacins share 46% sequence identity within the PP and the CD, and their trypsin-activated forms showed proteolytic activity in gelatin zymography and in solution against azocasein and the extracellular matrix proteins fibronectin, type IV collagen, gelatin and laminin, but not triple-helical collagen (Becker-Pauly et al., 2009). Finally, consistent with the horseshoe crab being a chelicerate, these astacins were found to be closer to an orthologue from the brown spider Loxosceles intermedia in a phylogenetic analysis than to the crustacean orthologues from the crayfish A. astacus and the shrimp Penaeus vannamei (Becker-Pauly et al., 2009).
Here, we crystallized the zymogen of the long paralogue, hereafter referred to as pLAST-MAM, and solved its crystal structure. Our results provide structural and molecular insight into the latency mechanism of the currently evolutionarily oldest holozoan astacin.

Protein crystallization
The pLAST-MAM zymogen was obtained by recombinant expression in Trichoplusia ni High Five insect cells, purified as described in Becker-Pauly et al. (2009) and subsequently concentrated in a Vivaspin device using a polyethersulfone membrane with 10 kDa cutoff (Vivaproducts). We screened for crystallization conditions using the sitting-drop vapour-diffusion method at the joint IBMB/IRB Automated Crystallography Platform (https://www.ibmb.csic.es/en/facilities/automated-crystallographic-platform). Reservoir solutions were prepared using a Tecan Freedom EVO robot and were dispensed into 96 × 2-well MRC plates (Innovadyne Technologies). A Phoenix/RE robot (Art Robbins) administered crystallization nanodrops consisting of 100 nl each of protein and reservoir solution. Crystallization plates were subsequently incubated at 4 or 20 °C in Bruker steady-temperature crystal farms. Successful initial conditions were refined and scaled up to the microlitre range in 24-well Cryschem crystallization dishes (Hampton Research) whenever possible. Optimal crystals of the protein at ∼7 mg ml⁻¹ in 50 mM HEPES pH 7.0 were obtained at 20 °C using 0.1 M bicine pH 9.0, 10% polyethylene glycol (PEG) 40 000, 2% dioxane as the reservoir solution. Crystals were thin and fragile rectangular plates, which were harvested using cryo-loops (Molecular Dimensions), rapidly passed through a cryo-buffer consisting of reservoir solution plus 20%(v/v) glycerol and flash-vitrified in liquid nitrogen for transport and data collection.

Diffraction data collection and processing
X-ray diffraction data were collected on 18 April 2010 using an ADSC Quantum 315r detector on beamline ID29 of the ESRF synchrotron, Grenoble, France. Diffraction data were processed using XDS (Kabsch, 2010) and XSCALE, and were transformed to MTZ format using XDSCONV for use with the Phenix (Liebschner et al., 2019) and CCP4 (Winn et al., 2011) suites. Analysis with phenix.xtriage within Phenix revealed an absence of translational noncrystallographic symmetry (NCS) and no significant twinning according to the L-test. The crystals contained two monomers in the asymmetric unit. Table 1 provides essential statistics on data collection and processing.

Structure solution and refinement
The structure of pLAST-MAM was solved by molecular replacement using the Phaser crystallographic software (McCoy et al., 2007) and a homology model for the CD and MAM domain predicted with AlphaFold. After several trials, we could only obtain correct solutions by searching with the domains separately, i.e. two for the CD but only one for the MAM domain. Those for the CD corresponded to Eulerian angles of α = 54.2, β = 54.0, γ = 116.3° and cell-fraction translation values of x = 0.106, y = 0.002, z = 0.210 for one protomer and α = 261.5, β = 125.5, γ = 297.0°, x = 0.419, y = 0.884, z = 0.303 for the second protomer. The corresponding values for the MAM moiety were α = 64.8, β = 106.0, γ = 172.9°, x = 0.285, y = 0.751, z = 0.991. These solutions had a final translation-function Z-score of 17.1 and a global log-likelihood gain after refinement of 782.
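As an illustration of how a triplet of Eulerian angles such as those reported above maps onto a rotation matrix, the following minimal sketch assumes the ZYZ convention (rotation about z, then y, then z) that is commonly used for molecular-replacement solutions. The convention, the function names and the composition order are assumptions made here for illustration and should be checked against the documentation of the program that produced the angles.

```python
import math

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(deg):
    """Rotation matrix for a rotation of `deg` degrees about the y axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(alpha, beta, gamma):
    """Assumed ZYZ composition: R = Rz(alpha) * Ry(beta) * Rz(gamma)."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))

def det3(m):
    """Determinant of a 3x3 matrix; +1 for a proper rotation."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Angles of the first CD solution quoted in the text (degrees).
R_CD_FIRST = euler_zyz(54.2, 54.0, 116.3)
```

Under this assumed convention, any angle triplet yields a proper rotation (determinant +1), which is a quick sanity check before applying the matrix and the fractional translation to a search model.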
The suitably rotated and translated molecules were subjected to the phenix.autobuild protocol (Terwilliger et al., 2008) within Phenix, which yielded a greatly improved Fourier map for manual model building with Coot (Casañal et al., 2020). The latter alternated with crystallographic refinement using the phenix.refine protocol (van Zundert et al., 2021) and BUSTER (Smart et al., 2012), which both included translation/libration/screw motion and NCS restraints, until completion of the model. The final model comprised residues Glu22-Cys403 of protomer A and Glu22-Gly246 of protomer B, each with a catalytic zinc ion plus one tentatively assigned magnesium cation, one diethylene glycol molecule, one triethylene glycol molecule, two glycerol molecules and 229 solvent molecules. The occupancy of LNK and MAM of protomer A refined to 87%. Table 1 provides essential statistics on the final refined model, which was validated through the wwPDB validation service (https://validate-rcsb-1.wwpdb.org/validservice). The coordinates can be retrieved from the Protein Data Bank (https://www.wwpdb.org/) as entry 8a28.

Miscellaneous
Structure superpositions were performed with SSM (Krissinel & Henrick, 2004) within Coot. Figures were prepared using UCSF Chimera (Goddard et al., 2018). Protein interfaces and intermolecular interactions were analysed using PDBe-PISA (https://www.ebi.ac.uk/pdbe/pisa; Krissinel & Henrick, 2007) and verified by visual inspection. For this, the interacting surface of a complex was taken as half of the sum of the buried surface areas of either molecule.
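The interface-area convention stated above (half of the sum of the surface buried on either molecule) can be written as a one-line calculation. The sketch below is only illustrative: the function name and the per-partner values are hypothetical and are not taken from the PDBe-PISA output for this structure.

```python
def interface_area(buried_a: float, buried_b: float) -> float:
    """Interface area (in Å^2) taken as half the sum of the solvent-accessible
    surface buried on each of the two interacting molecules."""
    if buried_a < 0 or buried_b < 0:
        raise ValueError("buried surface areas must be non-negative")
    return 0.5 * (buried_a + buried_b)

# Hypothetical per-partner buried areas, chosen only for illustration.
example_area = interface_area(1000.0, 800.0)  # 900.0
```

Averaging the two buried areas avoids counting the same contact twice, which is why interface areas reported this way are roughly half of the total buried surface.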

Overall crystal arrangement
To prevent autolysis, pLAST-MAM was recombinantly expressed in insect cells as a point mutant in which the general base/acid glutamate for catalysis (Arolas et al., 2018; E 140 ; residues are given as single-letter codes with numbering in superscript according to UniProt B4F320; other proteins are numbered in subscript) was replaced by alanine to create a catalytically impaired variant. This strategy has often been employed in the past to prevent autolysis when crystallizing MP zymogens (see Table 1 in Arolas et al., 2018). pLAST-MAM crystals with two protomers (A and B) in the crystallographic asymmetric unit were obtained in 2010 (Table 1).
Table 1. Crystallographic data. Abbreviations: AU, crystallographic asymmetric unit; GOL, glycerol; PEG, diethylene glycol; PGE, triethylene glycol; RSRZ, real-space R-value Z-score. Values in parentheses are for the outermost resolution shell.
While the CDs showed average thermal displacement parameters (B factors) of 60 and 73 Å² for protomers A and B, respectively, the segment spanning LNK and MAM of protomer A had an average B factor of 116 Å² after occupancy refinement to 87%. Inspection of the crystal packing revealed that the two CDs form tight layers parallel to the xy plane of the crystal with their respective crystallographic symmetry mates (1 and 2 in Figs. 1b and 1c). They are in a relative upside-down conformation, so that the C-termini protrude either above or below the CD layer. In the case of the A protomers, LNK and MAM project into the space between CD sections and make interactions with symmetric MAM and LNK moieties from the CD layer beneath, respectively, which are required to form the crystal (Figs. 1b and 1c). In contrast, the space between CD sections into which the C-termini of the B-protomer CDs point (sections 2 and 3 in Fig. 1d) does not contain any atoms and thus lacks crystal contacts owing to the missing LNKs and MAMs. However, when superposing the full-length protomer A on protomer B by their respective CDs, the LNK and MAM moieties adopt a very similar arrangement in the space between the two CD layers to that seen in the A protomers (sections 2 and 3 in Fig. 1e). Thus, LNK and MAM of the B protomers must also be present in the crystal to establish the intermolecular contacts necessary to build the crystal. Overall, we conclude that while both LNK-MAM moieties are very flexible and adopt several slightly different orientations that are able to assemble the crystal, those of protomer A are somewhat more rigid, so they are grossly defined in the final Fourier maps. In contrast, those of protomer B are so flexible that the density is too poor to confidently place them.
Thus, given the poor definition of the MAM domains, we will concentrate the discussion hereafter on the PP and CD moieties of the zymogen (referred to here as pLAST) and the mature CD (LAST) of protomer A, and the mechanism of latency in the context of other structurally characterized astacin zymogens. Suffice to say that the predicted structure of the MAM domain of pLAST-MAM is very similar to that of the human astacin-family member meprin β except for some loops (Fig. 1f). For a discussion of the architecture and features of these domains, please refer to Cismasiu et al.
(Fig. 2a). The PP runs along the front surface of pLAST from right to left and features helix α1 on the primed side of the cleft (substrate and active-site subsite terminology based on Schechter & Berger, 1967; Gomis-Rüth, Botelho et al., 2012). It adopts a wide loop structure protruding from the cleft between L 29 and D 38 (Fig. 3a), which is stabilized by two intramain-chain hydrogen bonds.
As in other astacins, the 195-residue CD divides into an NTS and a CTS of approximately equal size (Fig. 2a). The NTS is rich in regular secondary structure and consists of a five-stranded arched and twisted β-sheet (β1-β5), the strands of which parallel the active-site cleft except for the lowermost (β4), which is antiparallel and frames the upper rim of the cleft. The concave face of the sheet accommodates three helices (α3-α5), among which are a 'backing helix' (α4) and an 'active-site helix' (α5) that are characteristic of astacins and metzincins in general (Stöcker et al., 1993; Gomis-Rüth, 2009; Gomis-Rüth, Trillo-Muyo et al., 2012; Cerdà-Costa & Gomis-Rüth, 2014; Arolas et al., 2018). The active-site helix encompasses the first two-thirds of a conserved zinc-binding motif (H 139 -E-X-X-H-X-X-G-X-X-H 149 in pLAST) found in astacins and other metzincins, which features three metal-binding histidines and the general base/acid glutamate, here replaced with an alanine (see above and Fig. 2c).
At the glycine of the motif (G 146 ), the polypeptide undergoes a sharp downwards turn to enter the CTS, which in contrast to the NTS is more irregular. It contains two short helices (α6 and α7) and the short β-ribbon β6β7 in addition to a 'C-terminal helix' (α8), which again is characteristic of metzincins. Of note is another conserved structural element of metzincins, the 'Met-turn', which is a tight 1,4-turn (S 194 -L 197 ) encompassing the strictly conserved M 196 (Fig. 2c). Its side chain provides a hydrophobic pillow for the metal-binding site that is essential for the stability and function of metzincins (Tallant, García-Castellanos et al., 2010). Immediately downstream of this methionine, Y 198 provides the fourth zinc ligand of the CD through its somewhat more distant Oη atom. In other astacins, this residue is swung out upon substrate binding following a 'tyrosine switch' and its Oη atom participates in stabilization of the reaction intermediate during catalysis (Stöcker & Yiallouros, 2013). Finally, a disulfide bond links the back of the NTS with the C-terminal helix α8 of the CTS (C 90 -C 244 ) and a second one links strand β4 with the loop connecting β5 and α5 (Lβ5α5) (C 112 -C 131 ) (Fig. 2a).

Mechanism of latency
Latency is achieved in pLAST by blocking access of substrates through the PP, which runs across the active-site cleft of the CD moiety in the opposite direction to a substrate (Figs. 2a, 2b and 3a). This is a strategy to prevent untimely autolytic cleavage in cis (Khan & James, 1998; Arolas et al., 2018). In addition, the polypeptide chain does not adopt an extended conformation as required for substrates to be cleaved (Tyndall et al., 2005) but rather the aforementioned loop structure protrudes from the cleft (Fig. 2b). This prevents a scissile bond from extending across cleft subsites S1 and S1′ (Fig. 3a), which is another mechanism to prevent undesired cleavage (Arolas et al., 2018). The surface occluded by the PP-CD interaction spans 1207 Å², which is in the range reported for protein-protein complexes (∼380-3390 Å²; Chen et al., 2013), and has a solvation free-energy gain upon interface formation (ΔiG) of −16.9 kcal mol⁻¹ (Krissinel & Henrick, 2007), indicating a strong interaction. Participating structural elements include the entire PP and segments N 49 -V 52 , D 110 -V 116 , Y 129 -H 143 , W 148 -N 152 , S 170 -M 178 , Y 198 -T 208 and P 223 -K 226 of the CD, with the establishment of 20 electrostatic interactions and hydrophobic contacts between 17 pairs of residues of either moiety (Table 2).
Figure 3. Active-site cleft details and proposed activation mechanism of pLAST. (a) Close-up view of Fig. 2(a) in stereo depicting residues engaged in the PP-CD interaction as sticks with C atoms in green (PP; blue labels) or plum (CD; red labels). The labels of the residues shown in Fig. 2(c) have not been included for clarity. (b) Superposition in stereo of the Cα traces of the experimental structure of pLAST (in tan for the CD moiety and semi-transparent aquamarine for the PP) and the AlphaFold homology model of LAST (in salmon) to illustrate the proposed activation mechanism. Small differences are found in segments G 199 -D 206 (1) and E 150 -E 179 (2) owing to a closing motion that slightly narrows the cleft. Large differences are encountered for the 'activation segment' (3; P 180 -N 187 ) and the first seven residues of the mature CD (4; N 49 -L 56 ). Green arrows pinpoint the proposed movements upon maturation.
The primary activation site of pLAST (K 48 -N 49 ) is inserted within short helix α2 and buried in the zymogen, thus preventing access by activating enzymes in a similar fashion as found in pro-astacin (Guevara et al., 2010). Moreover, K 48 Nζ makes strong interactions with Y 173 O (2.7 Å apart) and N 176 O (3.1 Å) of the CD and with E 36 Oε2 (2.7 Å) of the PP motif, which likewise hinder activation. The latter interaction is reminiscent of the double salt bridge between an arginine and an aspartate in a PP motif found in matrix metallopeptidase (MMP) zymogens (P-R-C-G-X-P-D; Springman et al., 1990; Tallant, Marrero et al., 2010; Arolas et al., 2018). Moreover, the activation scissile-bond N atom is bound to D 47 Oδ2 (2.8 Å) within the PP, so the activation site is additionally protected in the zymogen. All of these findings support the maturation of pLAST requiring partial unfolding of the segment flanking the activation site and/or preliminary cleavages, as described for crayfish astacin (Yiallouros et al., 2002; Guevara et al., 2010).
The most relevant element for latency is D 38 , which binds the catalytic zinc in a bidentate manner through its Oδ1 (2.2 Å) and Oδ2 (2.4 Å) atoms (Fig. 2c), thus replacing the catalytic solvent required for catalysis in mature MPs (Arolas et al., 2018). This aspartate is embedded in the PP motif and contributes to a distorted octahedral metal coordination sphere together with H 139 Nε2 (2.1 Å) and H 149 Nε2 (2.1 Å) in plane with the cation and with H 143 Nε2 (2.1 Å) and Y 198 Oη (3.3 Å) in the apical positions. Thus, D 38 functions as an 'aspartate switch' for latency maintenance as described previously for crayfish astacin (Guevara et al., 2010) and human meprin β (Arolas et al., 2012) within the astacins (see below) and for fragilysin-3 (Goulas et al., 2011).

Proposed mechanism of activation
The archetypal astacin from crayfish, which like the horseshoe crab is an arthropod, represents the evolutionarily closest orthologue of LAST with a known mature structure. Indeed, 157 Cα atoms from these proteins superpose with a core root-mean-square deviation (r.m.s.d.) of 1.3 Å (38% sequence identity). Moreover, a predicted homology model of LAST was obtained with AlphaFold, which showed most of the common features in relevant segments described for mature astacin. It had an average predicted local distance difference test (pLDDT) value of >97, which is indicative of high reliability. Thus, this model is taken hereafter as a working model of mature Limulus astacin.
Table 2. Interactions between the pro-peptide (PP) and the catalytic domain (CD) of pLAST protomer A.
Superposition of the pLAST structure and the LAST model (Fig. 3b) reveals that the CD moieties mostly coincide. In particular, the NTSs match best, with an r.m.s.d. of 0.93 Å for all 746 atoms of segment L 57 -H 149 . The metal-binding site and most of the active-site cleft would largely be preformed in the zymogen, as observed for other MP zymogens (Arolas et al., 2018). Within the CTS, good agreement is observed for the segment E 188 -G 199 , which includes the Met-turn, and the entire C-terminal stretch from G 207 to C 244 . Loop G 199 -D 206 , which frames the lower rim of the cleft, slightly deviates, with a maximal displacement of ∼2 Å that closes the cleft on the primed side upon activation. On the bottom of the nonprimed side of the cleft, E 150 -E 179 would additionally undergo a closing motion of maximally ∼3 Å facilitated by a ∼10° rotation around W 198 . The largest deviation, however, is observed for the segment P 180 -N 187 , which conforms to a flexible 'activation domain' and would become significantly rearranged (Fig. 3b), as described for other astacins (Guevara et al., 2010) and the otherwise unrelated trypsin-like serine endopeptidases (Huber & Bode, 1978). This rearrangement would result from the displacement of N 49 -L 56 , which upon maturation cleavage at K 48 -N 49 would become rotated outwards around the Cα-C bond of L 56 . In this way, the seven preceding residues would be amply repositioned by up to ∼11 Å and penetrate the mature enzyme moiety, so the first three residues (N 49 -A 50 -I 51 ) would be completely inaccessible to solvent, as reported for meprin β (see Section 3.5). Next, N 49 would bind the 'family-specific residue' immediately after the third zinc-binding histidine (E 150 ; Bode et al., 1993; Gomis-Rüth, 2003), which in turn is held in place by internal salt bridges with R 237 and R 153 in the zymogen.
This interaction could occur directly through the N 49 Nδ2 atom, as observed in meprin β (Arolas et al., 2012). An alternative interaction through the α-amino group (N 49 N) mediated by a solvent molecule, as observed in crayfish astacin, is also conceivable. Moreover, the N 49 Oδ1 atom might also bind the R 237 side chain. Overall, this scenario of a deeply buried mature N-terminus is very similar to that found in other astacins, in which the maturation mechanism has been structurally verified (see Section 3.5). This, in turn, provides confidence in the reliability of the LAST homology model.
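The r.m.s.d. values quoted in this section summarize how closely equivalent atoms coincide once two structures have been superposed. A minimal sketch of the underlying calculation is given below; it assumes that the two coordinate lists are already superposed and paired atom by atom (the superposition itself, performed here with SSM within Coot, is not reproduced), and the function name is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å) between
    two equally long lists of (x, y, z) tuples that are already superposed
    and listed in the same atom order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example: two atoms, each displaced by exactly 1 Å along z.
toy = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
           [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])  # 1.0
```

Because the deviations are squared before averaging, a few strongly displaced atoms (such as those of an activation segment) dominate the value, which is why segment-wise r.m.s.d. values are more informative than a single global figure.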

Comparison with other astacin latency mechanisms
To date, the crystal structures of crayfish pro-astacin (PDB entry 3lq0; Guevara et al., 2010), human pro-meprin β (PDB entry 4gwm; Arolas et al., 2012) and pro-myroilysin from two closely related bacterial species, Myroides profundi (PDB entry 5czw; Xu et al., 2017) and Myroides sp. CSLB8 (PDB entry 5gwd; Xu et al., 2017), have been reported, as well as their respective mature forms astacin (PDB entry 1ast; Gomis-Rüth et al., 1993), meprin β (PDB entry 4gwn; Arolas et al., 2012) and myroilysin from Myroides sp. CSLB8 (PDB entry 5zjk; Ran et al., 2020). The two proteins from Myroides are 99.6% identical, so only that from Myroides sp. CSLB8 will be discussed here. Of all these structures, only pro-meprin β spans additional domains downstream of the CD, namely an MAM and a TRAF domain (Arolas et al., 2012). Pictures of the three zymogens superposed onto the mature forms, together with those of the pLAST structure and the LAST model, are provided in Figs. 4(a)-4(d).
In all cases, the mature N-terminus is buried inside the catalytic moiety and is bound to the family-specific glutamate of astacins either directly through an N-terminal asparagine (LAST and meprin β) or glycine (myroilysin) or mediated by a solvent molecule because the N-terminal segment is one residue shorter (astacin). The position of the new N-terminus in the zymogen and the mature moiety is very close in astacin (∼2 Å; Fig. 4b), quite close in meprin β (∼6 Å; Fig. 4c), farther apart in LAST (∼11 Å; Fig. 4a) and farthest in myroilysin (∼17 Å; Fig. 4d).
Detailed analysis of the four zymogen-mature enzyme pairs reveals that in all cases the PP is poor in regular secondary structure and adopts a mostly extended conformation that traverses the active-site cleft in the opposite direction to a substrate. In pro-myroilysin it is additionally elongated at the N-terminus and further extends along the front surface of the NTS (Fig. 4d), while in pro-meprin β (Fig. 4c) it runs in an extended conformation along a neighbouring TRAF domain on the right of the CD (not shown). In all cases, CTS regions framing the bottom of the active-site cleft on its nonprimed side constitute activation segments that undergo rearrangement upon maturation cleavage and repositioning of the new N-terminus. In astacin, only this activation segment (I 130 -E 139 , mature enzyme numbering according to PDB entry 1ast; add 49 for full-gene numbering; see UniProt P07584) is reorganized, while the rest of the molecule is preformed in the zymogen (Guevara et al., 2010; Fig. 4b). Next, LAST is most likely to undergo slight rearrangement of two segments (G 199 -D 206 and E 150 -E 179 ) in addition to the major movement of the activation segment (P 180 -N 187 ; see Section 3.4 and Fig. 4a). Meprin β, in turn, repositions most of its CTS (Q 164 -Y 211 and L 199 -D 233 according to UniProt Q16820; segment D 194 -L 199 is disordered in the zymogen structure) in a concerted hinge motion that entirely closes the cleft at its bottom in response to maturation (Fig. 4c). Finally, the largest deviation is observed in myroilysin, which rearranges its entire CTS except for the Met-turn and the C-terminal helix (Fig. 4d). The segments affected are Q 155 -A 201 and Y 210 -N 225 (myroilysin numbering according to PDB entry 5gwd; see also UniProt A0A0P0DZ84). A large flap (N 160 -S 193 ), which encompasses two helices, is folded back on top of the active-site cleft and traps the PP in the zymogen. Upon maturation, this flap is rotated to the right with a maximal displacement of ∼17 Å (measured at P 176 ), thus liberating access to the cleft (Fig. 4d).
Differences are also found in the residues blocking the zinc ion in the zymogen. The three metazoan proteins contain an aspartate within the PP motif, which is structurally conserved (Fig. 4e), acting as an aspartate switch. In contrast, the bacterial enzyme lacks the PP motif and instead features a cysteine, which blocks the zinc according to a 'cysteine-switch' mechanism (Ran et al., 2020; Xu et al., 2017). Moreover, the polypeptide chain flanking the cysteine is in a canonical, extended conformation and does not adopt the loop of the PP motif. Overall, this is inversely reminiscent of MMPs, in which canonical vertebrate orthologues regulate latency according to a cysteine-switch mechanism (Springman et al., 1990; Rosenblum et al., 2007; Tallant, Marrero et al., 2010; Arolas et al., 2018), while the bacterial orthologue karilysin from the periodontopathogen Tannerella forsythia instead operates according to an aspartate switch. As in astacins, MMPs are only found dispersedly outside animals, and it has been proposed that karilysin is a xenologue coopted from a mammalian host through horizontal gene transfer facilitated by intimate interaction between the host and the colonizing bacterium (Cerdà-Costa et al., 2011). A similar origin is conceivable for myroilysin within astacins given that Myroides spp. have been reported in several human body fluids and can trigger infection leading to soft-tissue infections in humans (Maraki et al., 2012), as also reported for a diabetic patient (Endicott-Yazdani et al., 2015). Thus, as in MMPs, the latency mechanisms of holozoan orthologues and bacterial xenologues would also diverge in astacins.
Figure 4. Activation of astacins with reported zymogen structures and a conserved PP motif. (a)-(d) Superposition in cross-eyed stereo of the Cα traces in standard orientation of the latent and mature forms of (a) Limulus astacin (latent, PDB entry 8a28; mature, AlphaFold model), (b) crayfish astacin [latent, PDB entry 3lq0 (Guevara et al., 2010); mature, PDB entry 1ast (Gomis-Rüth et al., 1993)], (c) human meprin β [latent, PDB entry 4gwm (Arolas et al., 2012); mature, PDB entry 4gwn (Arolas et al., 2012)] and (d) Myroides sp. CSLB8 myroilysin [latent, PDB entry 5gwd (Xu et al., 2017); mature, PDB entry 5zjk (Ran et al., 2020)]. The mature forms are in orange and the zymogens are in cyan (PP) and yellow (CD). The catalytic zinc ions are depicted as purple spheres. The PP of meprin β is N-terminally extended and runs across the front surface of a vicinal TRAF domain (not shown; Arolas et al., 2012). The most relevant rearranged segments during maturation cleavage, the 'activation segment' and the mature N-terminal segment, are pinpointed by green and red stars in each structure, respectively. (e) Superposition of the segments encompassing the PP motif of astacins (F-E-G-D-I) in pLAST (C atoms in cyan), crayfish pro-astacin (C atoms in tan) and human pro-meprin β (C atoms in plum). Myroilysin lacks this motif.

Data availability
All data and reagents are freely available from the authors upon reasonable request.